Tensor product

In mathematics, the tensor product, denoted by ⊗, may be applied in different contexts to vectors, matrices, tensors, vector spaces, algebras, topological vector spaces, and modules, among many other structures or objects. In each case the significance of the symbol is the same: the most general bilinear operation. In some contexts, this product is also referred to as the outer product. The term "tensor product" is also used in relation to monoidal categories.

Tensor product of vector spaces

The tensor product V ⊗ W of two vector spaces V and W over a field K can be defined by the method of generators and relations.

To construct V ⊗ W, one begins with the set of ordered pairs in the Cartesian product V × W. For the purposes of this construction, regard this Cartesian product as a set rather than a vector space. The free vector space F on V × W is defined by taking the vector space in which the elements of V × W are a basis. In set-builder notation,

F(V\times W) = \left\{\sum_{i=1}^n \alpha_i e_{(v_i, w_i)} \ \Bigg| \ n\in\mathbb{N}, \alpha_i\in K, (v_i, w_i)\in V\times W \right\},

where we have used the symbol e_{(v,w)} to emphasize that these are taken to be linearly independent by definition for distinct (v, w) ∈ V × W.

The tensor product arises by defining the following four equivalence relations in F(V × W):

\begin{align}
e_{(v_1 + v_2, w)} &\sim e_{(v_1, w)} + e_{(v_2, w)}\\
e_{(v, w_1 + w_2)} &\sim e_{(v, w_1)} + e_{(v, w_2)}\\
ce_{(v, w)} &\sim e_{(cv, w)} \sim e_{(v, cw)}
\end{align}

where v_1, v_2 and w_1, w_2 are vectors from V and W (respectively), and c is from the underlying field K. Denoting by R the subspace of F(V × W) spanned by the differences of the two sides of these four relations, the tensor product of the two vector spaces V and W is then the quotient space

V \otimes W = F(V \times W) / R.

It is also called the tensor product space of V and W and is a vector space (which can be verified by directly checking the vector space axioms). The tensor product of two elements v and w is the equivalence class e_{(v,w)} + R of e_{(v,w)} in V ⊗ W, denoted v ⊗ w. This notation can somewhat obscure the fact that tensors are always cosets: any manipulation performed on representatives (v, w) must be checked to be independent of the particular choice of representative.

The space R is mapped to zero in V ⊗ W, so that the above equivalence relations become equalities in the tensor product space:

\begin{align}
(v_1 + v_2) \otimes w &= v_1 \otimes w + v_2 \otimes w;\\
v \otimes (w_1 + w_2) &= v \otimes w_1 + v \otimes w_2;\\
         cv \otimes w &= v \otimes cw = c(v \otimes w).
\end{align}
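
The identities above can be checked numerically in coordinates. The following is a minimal Python sketch, not part of the construction itself: it assumes V = R^2 and W = R^3 with their standard bases and represents v ⊗ w by the array of products v_i w_j. The helper names (outer, add, scale, vadd) and the sample vectors are illustrative choices.

```python
def outer(v, w):
    """Coordinates of v ⊗ w: the array with entries v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]

def add(a, b):
    """Entrywise sum of two coordinate arrays."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple of a coordinate array."""
    return [[c * x for x in row] for row in a]

def vadd(x, y):
    """Sum of two coordinate vectors."""
    return [s + t for s, t in zip(x, y)]

v1, v2 = [1, 2], [3, -1]          # sample vectors in V = R^2
w1, w2 = [1, 1, 2], [0, -3, 7]    # sample vectors in W = R^3
c = 6                             # sample scalar

# (v1 + v2) ⊗ w1 = v1 ⊗ w1 + v2 ⊗ w1
left_additive = outer(vadd(v1, v2), w1) == add(outer(v1, w1), outer(v2, w1))
# v1 ⊗ (w1 + w2) = v1 ⊗ w1 + v1 ⊗ w2
right_additive = outer(v1, vadd(w1, w2)) == add(outer(v1, w1), outer(v1, w2))
# cv ⊗ w = v ⊗ cw = c(v ⊗ w)
cv1 = [c * x for x in v1]
cw1 = [c * x for x in w1]
homogeneous = outer(cv1, w1) == outer(v1, cw1) == scale(c, outer(v1, w1))

print(left_additive, right_additive, homogeneous)
```

Integer entries are used so that the comparisons are exact; with floating-point data a tolerance would be needed.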

Given bases {v_i} and {w_j} for V and W respectively, the tensors {v_i ⊗ w_j} form a basis for V ⊗ W (generally ordered so that v_i ⊗ w_{j+1} comes before v_{i+1} ⊗ w_j). The dimension of the tensor product is therefore the product of the dimensions of the original spaces; for instance R^m ⊗ R^n has dimension mn.
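
As an illustration of the basis and its ordering, the sketch below assumes V = R^2 and W = R^3 with standard bases and writes each basis tensor as a coordinate vector of length mn, with the index of the second factor varying fastest. The function names kron_vec and standard_basis are hypothetical helpers introduced for this example.

```python
def kron_vec(v, w):
    """Coordinates of v ⊗ w, ordered so that the index of w varies fastest."""
    return [vi * wj for vi in v for wj in w]

def standard_basis(n):
    """The n standard basis vectors of R^n as lists."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]

m, n = 2, 3
# All products of a basis vector of V with a basis vector of W, in the stated order.
basis = [kron_vec(e, f) for e in standard_basis(m) for f in standard_basis(n)]

print(len(basis))   # number of basis tensors, equal to m * n
for b in basis:
    print(b)
```

In this ordering the m·n products are exactly the standard basis vectors of R^{mn}, which is one way to see that the dimension is mn.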

Elements of V ⊗ W are sometimes referred to as tensors, although this term refers to many other related concepts as well.[1] An element of V ⊗ W of the form v ⊗ w is called a pure or simple tensor. In general, an element of the tensor product space is not a pure tensor, but rather a finite linear combination of pure tensors. For example, if v_1 and v_2 are linearly independent, and w_1 and w_2 are also linearly independent, then v_1 ⊗ w_1 + v_2 ⊗ w_2 cannot be written as a pure tensor. The minimal number of simple tensors required to express an element of a tensor product is called its tensor rank (not to be confused with tensor order, which is the number of spaces one has taken the product of, in this case 2; in notation, the number of indices). For linear operators or matrices, thought of as (1,1) tensors (elements of the space V ⊗ V*), it agrees with matrix rank.
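
A small coordinate check may make the distinction concrete. The sketch assumes V = W = R^2, where an element of V ⊗ W is a 2 × 2 array and is a pure tensor exactly when that array has matrix rank at most 1, that is, when its determinant vanishes. The helper names are illustrative.

```python
def outer(v, w):
    """Coordinates of the pure tensor v ⊗ w."""
    return [[vi * wj for wj in w] for vi in v]

def add(a, b):
    """Entrywise sum of two coordinate arrays."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def det2(t):
    """Determinant of a 2 x 2 array."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

e1, e2 = [1, 0], [0, 1]

pure = outer([2, 3], [5, 7])               # a pure tensor
mixed = add(outer(e1, e1), outer(e2, e2))  # e1 ⊗ e1 + e2 ⊗ e2

print(det2(pure))    # zero determinant: rank at most 1
print(det2(mixed))   # nonzero determinant: rank 2, so not a pure tensor
```

The determinant test is specific to the 2 × 2 case; in general one would compute the matrix rank of the coordinate array.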

Characterization by a universal property

The tensor product of V and W can be characterized (up to isomorphism) as any pair (L, φ), with L a vector space over K and \varphi:V\times W\to L a bilinear map such that

for any K-vector space Z and any bilinear map h:V\times W\to Z, there exists a unique linear map \tilde{h}:L\to Z satisfying h=\tilde{h}\circ\varphi.

In this sense, φ is the most general bilinear map that can be built from V × W.

It is easy to check that (V\otimes W, \; \otimes) satisfies this universal property.
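
The factorization h = h̃ ∘ φ can be illustrated with a familiar bilinear map. The sketch below takes the cross product on R^3 as h (an example chosen here, not one from the text), represents v ⊗ w by the 3 × 3 array of products v_i w_j, and writes the induced linear map on such arrays. The name cross_tilde is hypothetical.

```python
def outer(v, w):
    """Coordinates of v ⊗ w: the 3 x 3 array with entries v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]

def cross(v, w):
    """The bilinear map h: R^3 x R^3 -> R^3 (ordinary cross product)."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def cross_tilde(t):
    """The induced map on V ⊗ W; it is linear in the entries of the array t."""
    return [t[1][2] - t[2][1],
            t[2][0] - t[0][2],
            t[0][1] - t[1][0]]

v, w = [1, 2, 3], [4, 5, 6]
factored = cross_tilde(outer(v, w))   # h̃(v ⊗ w)

print(cross(v, w), factored)          # the two results agree
```

The point of the example is that cross is bilinear in (v, w) while cross_tilde is linear in the single argument t.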

As a functor

The tensor product also operates on linear maps between vector spaces. Specifically, given two linear maps S : V → X and T : W → Y between vector spaces, the tensor product of the two linear maps S and T is a linear map

S\otimes T:V\otimes W\rightarrow X\otimes Y

defined by

(S\otimes T)(v\otimes w)=S(v)\otimes T(w).

In this way, the tensor product becomes a bifunctor from the category of vector spaces to itself, covariant in both arguments.[2]

The Kronecker product of two matrices is the matrix of the tensor product of the two corresponding linear maps under a standard choice of bases of the tensor products (see the article on Kronecker products).
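
The defining identity (S ⊗ T)(v ⊗ w) = S(v) ⊗ T(w) can be checked in coordinates. This is a minimal sketch assuming standard bases, with V = R^2, X = R^3, W = R^3, Y = R^2, and with basis tensors ordered so that the second index varies fastest. The matrices, vectors and helper names are illustrative.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_vec(v, w):
    """Coordinates of v ⊗ w, with the index of w varying fastest."""
    return [vi * wj for vi in v for wj in w]

def matvec(M, x):
    """Matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

S = [[1, 2], [0, 1], [3, -1]]   # matrix of S : R^2 -> R^3
T = [[2, 0, 1], [1, 1, 0]]      # matrix of T : R^3 -> R^2
v, w = [1, -2], [3, 0, 4]

lhs = matvec(kron(S, T), kron_vec(v, w))       # (S ⊗ T)(v ⊗ w)
rhs = kron_vec(matvec(S, v), matvec(T, w))     # S(v) ⊗ T(w)

print(lhs)
print(rhs)
```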

More than two vector spaces

The construction and the universal property of the tensor product can be extended to allow for more than two vector spaces. For example, suppose that V1, V2, and V3 are three vector spaces. The tensor product V1 ⊗ V2 ⊗ V3 is defined along with a trilinear mapping from the direct product

\varphi: V_1\times V_2\times V_3 \to V_1\otimes V_2\otimes V_3

such that any trilinear map F from the direct product to a vector space W

F:V_1\times V_2\times V_3\to W

factors uniquely as

F = L\circ\varphi

where L is a linear map. The tensor product is uniquely characterized by this property, up to a unique isomorphism.

This construction is related to repeated tensor products of two spaces. For example, if V1, V2, and V3 are three vector spaces, then there are (natural) isomorphisms

V_1\otimes V_2\otimes V_3\cong V_1\otimes(V_2\otimes V_3)\cong (V_1\otimes V_2)\otimes V_3.

More generally, the tensor product of an arbitrary indexed family Vi, i ∈ I, is defined to be universal with respect to multilinear mappings of the direct product \scriptstyle\prod_{i\in I} V_i.

Tensor powers and braiding

Let n be a non-negative integer. The nth tensor power of the vector space V is the n-fold tensor product of V with itself. That is

V^{\otimes n} \;\overset{\mathrm{def}}{=}\; \underbrace{V\otimes\cdots\otimes V}_{n}.

A permutation σ of the set {1, 2, ..., n} determines a mapping of the nth Cartesian power of V

\sigma: V^n\to V^n

defined by

\sigma(v_1,v_2,\dots,v_n) = (v_{\sigma 1}, v_{\sigma 2},\dots,v_{\sigma n}).

Let

\varphi:V^n \to V^{\otimes n}

be the natural multilinear map from the Cartesian power of V into the tensor power of V. Since φ ∘ σ is again multilinear, the universal property gives a unique linear map

\tau_\sigma: V^{\otimes n} \to V^{\otimes n}

such that

\varphi\circ\sigma = \tau_\sigma\circ\varphi.

The morphism τσ is called the braiding map associated to the permutation σ.
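
For n = 2 and the transposition exchanging the two factors, the induced map on V ⊗ V sends v ⊗ w to w ⊗ v, which in coordinates is a permutation of components. The following sketch assumes V = R^3 with its standard basis and the ordering in which the second index varies fastest; the function name swap is illustrative.

```python
def kron_vec(v, w):
    """Coordinates of v ⊗ w, with the index of w varying fastest."""
    return [vi * wj for vi in v for wj in w]

def swap(t, n):
    """Send the component at position (j, i) to position (i, j).

    t is a coordinate vector of length n * n for an element of V ⊗ V.
    """
    return [t[j * n + i] for i in range(n) for j in range(n)]

v, w = [1, 2, 3], [4, 5, 6]
t = kron_vec(v, w)

print(swap(t, 3) == kron_vec(w, v))   # v ⊗ w is sent to w ⊗ v
print(swap(swap(t, 3), 3) == t)       # applying the map twice gives the identity
```

Because swap is defined on all coordinate vectors, it also applies to sums of pure tensors, as a linear map should.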

Tensor product of two tensors

A tensor on V is an element of a vector space of the form

 \begin{matrix} T^r_s(V) & = & \underbrace{ V\otimes \dots \otimes V} & \otimes  & \underbrace{ V^*\otimes \dots \otimes V^*} & = & V^{\otimes r}\otimes V^{*\otimes s}\\ & & r & & s \end{matrix}

for non-negative integers r and s. There is a general formula for the components of a (tensor) product of two (or more) tensors. For example, if F and G are two covariant tensors of rank m and n respectively (i.e. F ∈ T^0_m(V) and G ∈ T^0_n(V)), then the components of their tensor product are given by[3]

(F\otimes G)_{i_1 i_2 \dots i_{m+n}} = F_{i_1 i_2 \dots i_m} G_{i_{m+1} i_{m+2} \dots i_{m+n}}.

In this example, it is assumed that there is a chosen basis B of the vector space V, and the basis on any tensor space T^r_s(V) is tacitly assumed to be the standard one (this basis is described in the article on Kronecker products). Thus, the components of the tensor product of two tensors are the ordinary product of the components of each tensor.

Note that in the tensor product, the factor F consumes the first rank(F) indices, and the factor G consumes the next rank(G) indices, so

\mathrm{rank}( F \otimes G )=\mathrm{rank}(F)+\mathrm{rank}(G).
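
The component formula and the index count can be seen in a small case. The sketch assumes V = R^2 with a fixed basis, takes F with one index and G with two indices, and stores components as nested lists; the function depth, which counts nesting levels, is a hypothetical helper.

```python
F = [1, 2]                 # components F_i of a tensor with one index
G = [[3, 4], [5, 6]]       # components G_{jk} of a tensor with two indices

# (F ⊗ G)_{ijk} = F_i * G_{jk}
FG = [[[F[i] * G[j][k] for k in range(2)]
       for j in range(2)]
      for i in range(2)]

def depth(x):
    """Number of indices of a nested-list component array."""
    return 1 + depth(x[0]) if isinstance(x, list) else 0

print(FG[1][0][1])                     # F_1 * G_{01} with zero-based indices
print(depth(F), depth(G), depth(FG))   # the index counts add up
```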

The tensor space \scriptstyle T^r_s(V) may be naturally viewed as a module for the Lie algebra \scriptstyle \mathrm{End}(V) by means of the diagonal action: for simplicity let us assume \scriptstyle r = s = 1; then, for each \scriptstyle u \in\mathrm{End}(V),

 u(a \otimes b)  = u(a) \otimes b - a \otimes u^*(b),

where \scriptstyle u^* \in \mathrm{End}(V^*) is the transpose of \scriptstyle u, that is, in terms of the obvious pairing on \scriptstyle V \otimes V^*,

\langle u(a), b \rangle = \langle a, u^*(b) \rangle.

There is a canonical isomorphism \scriptstyle T^1_1(V) \rightarrow \mathrm{End}(V) given by

(a \otimes b)(x) = \langle x, b \rangle a.

Under this isomorphism, every \scriptstyle u \in\mathrm{End}(V) may be first viewed as an endomorphism of \scriptstyle T^1_1(V) and then viewed as an endomorphism of  \scriptstyle\mathrm{End}(V) . In fact it is the adjoint representation \scriptstyle\mathrm{ad} (u) of \mathrm{End}(V).
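
The claim that the diagonal action corresponds to the adjoint representation can be checked on a pure tensor. The sketch assumes V = R^2 with the standard basis, so that a ⊗ b corresponds to the matrix with entries a_i b_j and u* is the transposed matrix; it then compares the image of u(a ⊗ b) with the commutator uM − Mu. All helper names and sample values are illustrative.

```python
def outer(a, b):
    """Matrix of the endomorphism x -> <x, b> a, with entries a[i] * b[j]."""
    return [[x * y for y in b] for x in a]

def matvec(m, x):
    """Matrix-vector product."""
    return [sum(r * s for r, s in zip(row, x)) for row in m]

def matmul(p, q):
    """Matrix-matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    """Transposed matrix."""
    return [list(col) for col in zip(*m)]

def sub(p, q):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]

u = [[1, 2], [3, 4]]
a, b = [1, -1], [2, 5]
M = outer(a, b)                       # image of a ⊗ b in End(V)

# u(a ⊗ b) = u(a) ⊗ b - a ⊗ u*(b), mapped into End(V)
action = sub(outer(matvec(u, a), b), outer(a, matvec(transpose(u), b)))
# ad(u)(M) = uM - Mu
commutator = sub(matmul(u, M), matmul(M, u))

print(action)
print(commutator)
```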

Example

Let U be a tensor of type (1,1) with components U^α_β, and let V be a tensor of type (1,0) with components V^γ. Then

 U^\alpha {}_\beta V^\gamma = (U \otimes V)^\alpha {}_\beta {}^\gamma

and

 V^\mu U^\nu {}_\sigma = (V \otimes U)^{\mu \nu} {}_\sigma.

The tensor product inherits all the indices of its factors.

Kronecker product of two matrices

With matrices this operation is usually called the Kronecker product, a term used to make clear that the result has a particular block structure imposed upon it, in which each element of the first matrix is replaced by the second matrix, scaled by that element. For matrices \scriptstyle U and \scriptstyle V this is:

U \otimes V
        = \begin{bmatrix} u_{1,1}V & u_{1,2}V & \cdots \\
                          u_{2,1}V & u_{2,2}V \\
                          \vdots  &         & \ddots
          \end{bmatrix}
 = \begin{bmatrix}
       u_{1,1}v_{1,1} & u_{1,1}v_{1,2} & \cdots & u_{1,2}v_{1,1} & u_{1,2}v_{1,2} & \cdots \\
       u_{1,1}v_{2,1} & u_{1,1}v_{2,2} &        & u_{1,2}v_{2,1} & u_{1,2}v_{2,2} \\
       \vdots       &              & \ddots \\
       u_{2,1}v_{1,1} & u_{2,1}v_{1,2} \\
       u_{2,1}v_{2,1} & u_{2,1}v_{2,2} \\
       \vdots
   \end{bmatrix}.

For example, the tensor product of two 2 × 2 matrices:


  \begin{bmatrix} 
    a_{1,1} & a_{1,2} \\ 
    a_{2,1} & a_{2,2} \\ 
  \end{bmatrix}
\otimes
  \begin{bmatrix} 
    b_{1,1} & b_{1,2} \\ 
    b_{2,1} & b_{2,2} \\ 
  \end{bmatrix}
=
  \begin{bmatrix} 
    a_{1,1}  \begin{bmatrix} 
              b_{1,1} & b_{1,2} \\ 
              b_{2,1} & b_{2,2} \\ 
            \end{bmatrix} & a_{1,2}  \begin{bmatrix} 
                                      b_{1,1} & b_{1,2} \\ 
                                      b_{2,1} & b_{2,2} \\ 
                                    \end{bmatrix} \\ 
     & \\
    a_{2,1}  \begin{bmatrix} 
              b_{1,1} & b_{1,2} \\ 
              b_{2,1} & b_{2,2} \\ 
            \end{bmatrix} & a_{2,2}  \begin{bmatrix} 
                                      b_{1,1} & b_{1,2} \\ 
                                      b_{2,1} & b_{2,2} \\ 
                                    \end{bmatrix} \\ 
  \end{bmatrix}
=
  \begin{bmatrix} 
    a_{1,1} b_{1,1} & a_{1,1} b_{1,2} & a_{1,2} b_{1,1} & a_{1,2} b_{1,2} \\ 
    a_{1,1} b_{2,1} & a_{1,1} b_{2,2} & a_{1,2} b_{2,1} & a_{1,2} b_{2,2} \\ 
    a_{2,1} b_{1,1} & a_{2,1} b_{1,2} & a_{2,2} b_{1,1} & a_{2,2} b_{1,2} \\ 
    a_{2,1} b_{2,1} & a_{2,1} b_{2,2} & a_{2,2} b_{2,1} & a_{2,2} b_{2,2} \\ 
  \end{bmatrix}.

The result has tensor rank 4 in the sense of the number of indices (two from each factor) and 16 components. This index count, also called the tensor order, should not be confused with the matrix rank of the resulting 4 × 4 array, which is at most 4.

A representative case is the Kronecker product of any two rectangular arrays, considered as matrices. A dyadic product is the special case of the tensor product between two vectors of the same dimension.
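
The 2 × 2 case above can be reproduced with concrete numbers. The following is a minimal sketch with matrices stored as lists of rows; the function name kron and the sample entries are choices made for this example.

```python
def kron(A, B):
    """Kronecker product: each entry a of A is replaced by the block a * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]

K = kron(A, B)
for row in K:
    print(row)
# [0, 5, 0, 10]
# [6, 7, 12, 14]
# [0, 15, 0, 20]
# [18, 21, 24, 28]
```

The top-left 2 × 2 block is 1·B, the top-right block is 2·B, and so on, matching the block structure in the displayed formula.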

Tensor product of multilinear maps

Given multilinear maps \scriptstyle f (x_1,\dots,x_k) and \scriptstyle g (x_1,\dots, x_m) their tensor product is the multilinear function

 (f \otimes g) (x_1,\dots,x_{k+m}) = f(x_1,\dots,x_k) g(x_{k+1},\dots,x_{k+m}).
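
This definition translates directly into a higher-order function. The sketch below assumes scalar-valued multilinear maps represented as Python callables whose numbers of arguments k and m are passed explicitly; the name tensor_product and the sample maps dot and first are hypothetical.

```python
def tensor_product(f, k, g, m):
    """Return the map (x_1, ..., x_{k+m}) -> f(x_1..x_k) * g(x_{k+1}..x_{k+m})."""
    def fg(*xs):
        if len(xs) != k + m:
            raise TypeError(f"expected {k + m} arguments, got {len(xs)}")
        return f(*xs[:k]) * g(*xs[k:])
    return fg

def dot(x, y):
    """A bilinear map on R^2 x R^2."""
    return sum(a * b for a, b in zip(x, y))

def first(x):
    """A linear functional on R^2: the first coordinate."""
    return x[0]

h = tensor_product(dot, 2, first, 1)   # a trilinear map

print(h([1, 2], [3, 4], [5, 6]))       # dot([1, 2], [3, 4]) * 5
print(h([1, 2], [3, 4], [10, 12]))     # doubling the last argument doubles the value
```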

Relation with the dual space

In the discussion on the universal property, replacing Z by the underlying scalar field K of V and W yields that the space \scriptstyle (V \otimes W)^* (the dual space of \scriptstyle V \otimes W, containing all linear functionals on that space) is naturally identified with the space of all bilinear functionals on \scriptstyle V \times W. In other words, every bilinear functional is a functional on the tensor product, and vice versa.

Whenever \scriptstyle V and \scriptstyle W are finite dimensional, there is a natural isomorphism between \scriptstyle V^* \otimes W^* and \scriptstyle (V \otimes W)^*, whereas for vector spaces of arbitrary dimension we only have an inclusion \scriptstyle V^* \otimes W^*\subset (V \otimes W)^*. So, the tensors of the linear functionals are bilinear functionals. This gives us a new way to look at the space of bilinear functionals, as a tensor product itself.
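
In finite dimensions the identification can be checked in coordinates: the tensor f ⊗ g of two linear functionals, evaluated on v ⊗ w, gives f(v) g(w). The sketch assumes V = W = R^2 with standard bases and represents functionals by their coefficient vectors; the helper names and sample values are illustrative.

```python
def kron_vec(x, y):
    """Coordinates of x ⊗ y, with the index of y varying fastest."""
    return [xi * yj for xi in x for yj in y]

def dot(x, y):
    """Pairing of a coefficient vector with a coordinate vector."""
    return sum(a * b for a, b in zip(x, y))

f, g = [1, 2], [3, -1]    # linear functionals on V and W
v, w = [4, 5], [2, 7]     # vectors in V and W

as_functional = dot(kron_vec(f, g), kron_vec(v, w))   # (f ⊗ g)(v ⊗ w)
as_bilinear = dot(f, v) * dot(g, w)                   # f(v) * g(w)

print(as_functional, as_bilinear)
```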

Types of tensors

Linear subspaces of the bilinear operators (or in general, multilinear operators) determine natural quotient spaces of the tensor space, which are frequently useful. See wedge product for the first major example. Another would be the treatment of algebraic forms as symmetric tensors.

Over more general rings

The notation ⊗_R refers to a tensor product of modules over a ring R.

Tensor product for computer programmers

Array programming languages

Array programming languages may have this pattern built in. For example, in APL the tensor product is expressed as \scriptstyle\circ . \times (for example \scriptstyle A \circ . \times B or \scriptstyle A \circ . \times B \circ . \times C). In J the tensor product is the dyadic form of */ (for example a */ b or a */ b */ c).

Note that J's treatment also allows the representation of some tensor fields, as a and b may be functions instead of constants. This product of two functions is a derived function, and if a and b are differentiable, then a*/b is differentiable.

However, these kinds of notation are not universally present in array languages. Other array languages may require explicit treatment of indices (for example, MATLAB), and/or may not support higher-order functions such as the Jacobian derivative (for example, Fortran/APL).
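
In a language without a built-in operator for this pattern, the same result can be obtained with explicit recursion over indices. The following Python sketch mimics the behaviour of the APL expression A ∘.× B on nested lists of numbers; the function name outer is chosen for this example and the code is not meant to reproduce either language's implementation.

```python
def outer(a, b):
    """Multiply every entry of a by every entry of b.

    The result is nested so that the indices of a come first, then those of b.
    """
    if isinstance(a, list):
        return [outer(x, b) for x in a]
    if isinstance(b, list):
        return [outer(a, y) for y in b]
    return a * b

a, b, c = [1, 2], [3, 4], [5, 6]

ab = outer(a, b)
abc_left = outer(outer(a, b), c)    # (a ⊗ b) ⊗ c
abc_right = outer(a, outer(b, c))   # a ⊗ (b ⊗ c)

print(ab)
print(abc_left == abc_right)        # both groupings give the same array
```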

Notes

  1. ^ See tensor or tensor (intrinsic definition).
  2. ^ Hazewinkel et al. (2004), p. 100.
  3. ^ Analogous formulas also hold for contravariant tensors, as well as tensors of mixed variance. In many cases, such as when an inner product is defined, the distinction is irrelevant.

References